Circuit design of a dual-versioning L1 data cache

Authors

  • Azam Seyedi
  • Adrià Armejach
  • Adrián Cristal
  • Osman S. Unsal
  • Ibrahim Hur
  • Mateo Valero
Abstract

This paper proposes a novel L1 data cache design with dual-versioning SRAM cells (dvSRAM) for chip multi-processors that implement optimistic concurrency proposals. In this cache architecture, each dvSRAM cell comprises two cells, a main cell and a secondary cell, which keep two versions of the same logical data. These values can be accessed, modified, and moved back and forth between the main and secondary cells within the access time of the cache. We design and simulate a 32 KB dual-versioning L1 data cache and introduce three well-known use cases of optimistic concurrency execution that can benefit from our proposed design. © 2011 Elsevier B.V. All rights reserved.
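As a rough illustration of the dual-versioning idea described in the abstract, the Python sketch below models a single cell at the behavioral level only. The class name `DvCell` and the method names `checkpoint` and `restore` are assumptions made for this example; the abstract does not name the operations, and the sketch says nothing about the circuit itself or its timing.

```python
from dataclasses import dataclass


@dataclass
class DvCell:
    """Behavioral model of one dual-versioning cell (hypothetical names)."""

    main: int = 0       # version seen by ordinary reads and writes
    secondary: int = 0  # second version of the same logical data

    def read(self) -> int:
        # Ordinary accesses go to the main cell.
        return self.main

    def write(self, value: int) -> None:
        # Ordinary updates modify only the main cell.
        self.main = value

    def checkpoint(self) -> None:
        # Assumed operation: move the main value into the secondary cell.
        self.secondary = self.main

    def restore(self) -> None:
        # Assumed operation: move the secondary value back into the main cell.
        self.main = self.secondary


cell = DvCell()
cell.write(1)      # committed value
cell.checkpoint()  # keep a second version before speculating
cell.write(0)      # speculative update
cell.restore()     # discard the speculative update
```

In an optimistic concurrency setting, one plausible use is to keep the pre-speculation value in the secondary cell so that it can be brought back if the speculation fails; this pairing is an assumption of the sketch, not a detail given in the abstract.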

Similar resources

The Skewed Distribution of Working Sets: Leveraging Randomness for Cache Design

The increasing gap between processor and memory speeds, as well as the introduction of multicore CPUs, have exacerbated the dependency of CPU performance on the memory subsystem. This trend motivates the search for more efficient caching mechanisms, enabling both faster service of frequently used blocks and decreased power consumption. This thesis explores the temporal locality phenomenon in an...

L1 Cache Decomposition for Energy Efficient Processors

The L1 data cache is a time-critical module and, at the same time, a major source of energy consumption. To reduce its energy-delay product, we apply two principles of low power design: specialize part of the cache structure and break down the cache into smaller caches. To this end, we propose an L1 cache that combines new designs of a stack cache and a PSA cache. Individually, our stack and PSA...

Two-level Data Prefetching

Data prefetching has been shown to be an effective tool in hiding part of the latency associated with cache misses in modern processors. Traditionally, data prefetchers fetch data into a small prefetch buffer near the L1 for low latency, or the L2 cache for greater coverage and less cache pollution. However, with the L1–L2 cache speed gap growing, significant performance gains can be obtained i...

A 400-MHz S/390 Microprocessor - Solid-State Circuits, IEEE Journal of

A microprocessor implementing IBM S/390 architecture operates in a 10 + 2 way system at frequencies up to 411 MHz (2.43 ns). The chip is fabricated in a 0.2 µm L_eff CMOS technology with five layers of metal and tungsten local interconnect. The chip size is 17.35 mm × 17.30 mm with about 7.8 million transistors. The power supply is 2.5 V and measured power dissipation at 300 MHz is 37 W. The microproc...

Journal title:
  • Integration

Volume 45

Publication date: 2012